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Abstract —In this paper, we look to address the problem 
of estimating the dynamic direction of arrival (DOA) of a 
narrowband signal impinging on a sensor array from the far 
held. The initial estimate is made using a Bayesian compressive 
sensing (BCS) framework and then tracked using a Bayesian 
compressed sensing Kalman filter (BCSKF). The BCS framework 
splits the angular region into N potential DOAs and enforces 
a belief that only a few of the DOAs will have a non-zero 
valued signal present. A BCSKF can then be used to track the 
change in the DOA using the same framework. There can be 
an issue when the DOA approaches the endfire of the array. In 
this angular region current methods can struggle to accurately 
estimate and track changes in the DOAs. To tackle this problem, 
we propose changing the traditional sparse belief associated with 
BCS to a belief that the estimated signals will match the predicted 
signals given a known DOA change. This is done by modelling 
the difference between the expected sparse received signals and 
the estimated sparse received signals as a Gaussian distribution. 
Example test scenarios are provided and comparisons made with 
the traditional BCS based estimation method. They show that 
an Improvement in estimation accuracy is possible without a 
significant increase in computational complexity. 

Index Terms —DOA estimation, Bayesian compressed sensing, 
Kalman filter, dynamic DOA, DOA tracking 

I. Introduction 

Direction of arrival (DOA) estimation is the process of 
determining which direction a signal impinging on an array 
has arrived from HI. Commonly used methods of solving this 
problem are; MUSIC HI, E) and ESPRIT H, Q. However, 
these methods have two drawbacks: Firstly, we need some 
knowledge of the number of signals that are present. Secondly, 
evaluation of the covariance matrix is required, thus increasing 
the computational complexity required to solve the problem. 
This covariance matrix is estimated form the signals received 
by each sensor at different time snapshots. Instead, if we 
consider the fact that only a few of the potential DOAs will 
have a signal present then we can consider the problem from 


the view point of compressive sensing (CS) and work directly 
with the received signals. 

CS theory tells us that when certain conditions are met it 
is possible to recover some signals from fewer measurements 
than used by traditional methods Ih), Q. This can be applied 
to solve the problem of DOA estimation fSl, 13, ifTOll . First 
split the angular region of interest into N potential DOAs, 
where signals actually impinge on the array from only L 
(L << N) of these directions. The problem can then be 
formulated as finding the minimum number of DOAs with 
a signal present that still gives an acceptable approximation 
of the array output. Those directions that have the non-zero 
valued signals are then used as the DOA estimates. It is also 
possible to convert this problem into a probabilistic form 
and solve using a relevance vector machine (RVM) based 
approach HD, na, HI- It has been shown in the case of 
static DOA estimation that methods based on this approach 
offer encouraging results HD- 

Less work has been done on the problem of estimating 
a dynamic DOA. One option is to use particle filters or 
probability hypothesis density (PHD) filters. These filters have 
been used in the areas of DOA estimation and tracking of 

sources HI, HD, HD, ED- 

Alternatively, in lfT3 the authors track a dynamic DOA with 
a Kalman filter (KF) and narrow the angular region being 
considered to focus in more closely on the DOA estimate from 
the previous iteration. However, this removes the advantage of 
being able to directly work with the measured array signals 
and introduces an additional stage of having to reevaluate the 
steering vector of the array at each iteration of the KF. 

Bayesian Kalman filters (BKF) have been used to track 
dynamic sparse signals lf20l . where the predicted mean of 
the signals at each iteration is taken as the estimate from 
the previous iteration and the hyper-parameters (precision) are 
estimated using BCS, hence the term Bayesian compressed 


sensing Kalman Filter (BCSKF). There are still some issues 
with this method when applied to the problem of DOA 
estimation of a dynamic far field source using uniform linear 
arrays (ULAs). Namely when the DOA approaches the endfire 
region (i.e. the signal arrives parallel/close to parallel to the 
array), the estimation accuracy can degrade. This means that 
it is possible to initially have an accurate estimate and then 
not track the changes in DOA properly. 

In this paper, to solve this problem, we propose modifying 
the BCSKF to include information about what the received 
array signals would be expected to be at a given time snapshot. 
This is done by changing the distribution used to model the re¬ 
ceived signals. Now instead of assuming a zero-mean Gaussian 
hierarchial prior we assume that the mean is instead centred 
at the value that is expected given the previous snapshot’s 
estimate and the expected change in DOA. As a result we 
derive a new posterior distribution and marginal likelihood 
function that can be used to solve the DOA estimation problem 
by following a similar framework as used for the RVM. 
We term this framework the modified RVM, which is used 
to find an initial estimate of the DOA at the first snapshot 
and then at each subsequent snapshot to optimise the noise 
variance estimate as well as the hyperparameters required by 
the BCSKF. 

The remainder of this paper is structured in the following 
manner: Section m gives details of the proposed estimation 
method. This includes the array model being used (lITAb . the 
modified RVM framework for BCS (III-BIl and the BCSKF 
(III-CI) . In Section HIH an evaluation of the effectiveness of the 
proposed method is presented and conclusions are drawn in 
Section |IV] 

II. Proposed Design Methods 
A. Array Model 

A narrowband array structure consisting of M sensors is 
shown in Fig. [T] The sensors are assumed to be omnidirec¬ 
tional with identical responses. A plane-wave signal model 
is assumed, i.e. the signal impinges upon the array from the 
far field at an angle 6 as shown. In this work we assume 
that 0° < 0 < 180°. The distance from the first sensor to 
subsequent sensors is denoted as dm for m = 1,2,..., M, 
with di = 0, i.e. the distance from the first sensor to itself. 
Note, these values are multiples of a uniform adjacent sensor 
separation of Ad. 

The steering vector of the array is given by 

a{n,e) = ( 1 ) 

where il = ojTg is the normalised frequency with Tg being the 
sampling period, fj,m = ^ for m = 1,2,..., M, and 
denotes the transpose operation. Note, the steering vector of 
an array gives contains information about the array geometry. 


dyi =(M—1 )Ad 



Fig. 1. Linear Array structure being considered, consisting of M sensors 
with a uniform adjacent sensor separation of Ad. 

namely the sensor locations and the delay required for a signal 
to reach a given sensor. 

The output of the array, y^, at time snapshot k is then given 
by 

y^=Axk+nk, ( 2 ) 

where = [xk,i,Xk, 2 , ■■■,Xk,N]'^ S gives the received 

signals, n* = [nk,i,nk, 2 , G £Mxi censor 

noise and A = [a(r2,0i), a(r2, ^ 2 ), ^w)] G is 

the matrix containing the steering vectors for each angle of 
interest. 

B. Bayesian Compressed Sensing for DOA Estimation 

First split the angular range that is being monitored into 
N potential DOAs. Each direction can then be considered 
as having a signal present. However, only L « N of the 
received signals are non-zero valued, with the directions of 
these L signals giving the actual DOAs. 

Now we split (|2l) into real and imaginary components in the 
following way 


y^, = Axfe -f fife (3) 


^(yfc) ■ 


■ 7^(A) -1(A) ■ 

■ 7e(xfe) ■ 

_ 1 _ 

■ 7^(nfe) ■ 

Ask) . 


_ 1(A) 7^(A) 

2:(xfe) 


. A'^k) 


where TZf) and !{■) give the real and imaginary components 
respectively. Note, the difference between y^ and y^ is that y^. 
has been split into its real and imaginary components in yfe. 
As a result the dimensions of yj. are larger than for yfe, but 
we are now only considering real valued data. Similar is true 
when comparing A and A, and Xfe and Xfe and nfe and fife. 

The aim is to now find a solution for Xfe that gives the 
minimum Iq norm, i.e. the minimum number of non-zero 
valued signals. This is done by evaluating the following 

Xfe,opt = max T’(xfe,cr^,p|y,Xe), (4) 



























where tr is the variance of the Gaussian noise n, p = 
[pi,P 2 , ■.■,P 2 n]'^] contains the hyperparameters that are to be 
estimated and Xg holds the expected values of x^. 

To do this first obtain the following from (|2]i; 

^(yfc|xfe,cr^) = (27rcr2)-^exp| - - Axfc||i|. (5) 

Now exert a belief about the values of x^. that are expected 
by enforcing 

7^(xfe|p,xg) = (27r)-'^|Pp/2 ( 6 ) 

X exp| - -(Xfc -Xe)P(Xfe -Xe)'^|, 

where |P| indicates the determinant of P, where P = diag(p). 

It is also necessary to define the hyperparameters over p 
and cr^. There are various possibilities for the structuring of 
the priors on p, which represent mixing parameters in a scale 
mixture of normals representation of the marginal distribution 
of Xfe, which will here be in the Student-t family, see e.g. mil. 
One possibility would be to treat the complex components 
of Xfe as complex Student-t distributed, as detailed in ll22ll . 
1231. Here though we treat the real and imaginary components 
of Xfe as independent Student-t distributed random variables, 
and hence have independent Gamma priors for the mixing 
variables pn over all real and imaginary components of Xfci 

2N 

V{ip)=X{G{pn\Pi,h)- (7) 

n—1 

A Gamma prior can also be used for 

7’(a2)=G(a-2|/33,/34), ( 8 ) 


Note, the maximum of (fTol l is the posterior mean /r. 

Similarly to El, the probability T’(cr^,p|y^.) can be repre¬ 
sented in the following form: 

7^(o■^P|yfc) ~ 7^(yfe|P,c^^Xg)7?(p)T’(cr2) (13) 


and the second two terms on the right of (fOl l become constant 
with uniform scale priors, then maximising p|y;.) is 

roughly equivalent to maximising 7^(yj,|p, cr^,Xe). This can 
be achieved by a type 2 maximisation of its logarithm, which 
is given by I 

£(p,a2) = log|(2^a2)-^|S|5|P|^exp(-i (14) 

X (y^By^ -f x^Cxg - 2cr2y^ASPxe)) | 

= —i^2Mlog(27r)-|-2Mlogcr^ — log|S| — 
log|P|+a- 2 ||y,-A/r||i + /r^P/x 
-fxfPXg - x^P/r^, 

where B = (aH -f AP"^A^)-i and C = P - P'^SP. 

To do this (fl4l i is differentiated with respect to Pn and a~^ 
to obtain the update expressions 

-, (15) 

- Xe,nPn 


where 7 „ 
S and 


1 — Pn'Snn, '^nn Is the diagonal element of 


a 


2 

new 


Ilyfc - A/x||i 

2 M-^ 7 „ 


(16) 


where Pi, (32, Ps and ^4 are scale and shape priors. Note, when 
Xg = [ 0 , 0 ,..., 0 ]^ then (| 6 ]l reverts to the traditional hierarchial 
prior used in BCS ifTTIl . 

We know that 


7’(xfc,c^^p|yfe,Xg) = T’(xfc|yfe,cr^p,xg)7’(p,cr2|yfc) (9) 

and 0 

7^(yfe|xfc,o-2)T’(kfc|p,Xg) 


7’(xfe|yfe,p,cr ,Xg) = 


7’(yfe|p,o•^Xe) 
= (27r)-^|S|-i/2 


( 10 ) 


X exp ■( --(xfe -S (xfc - pt) 


The maximisation is then achieved by iteratively maximising 
([IT]l and (dll and (115b and (116b until a convergence criterion 
is met ifm . IIT 2 I . Note that when Xg = [0,0,...,0]^ the 
update expressions match that of the traditional RVM. The 
final estimate of the received signals is then given by 

Xfc,opt = -fPopt^ ^ + PoptXg^ (17) 

where and Popt = dmg{[popt,i,Popt, 2 , ■■■,Popt, 2 NV) are 
the result of optimising the noise estimate and hyperparameters 
respectively. 

The final estimated signals are then given by 


where the covariance matrix is given by 


Xk,opt,n — Xk,opt,n + jXk,opt,N+n- 


(18) 


S = (a- 2 A"’A-fP)-i, 

( 11 ) 

and the mean given by 


H = S(CT"^A’^yfc -l-Pxg). 

( 12 ) 

^See Appendix A 



Thresholding can then be applied to keep the L most sig¬ 
nificant signals as in m. To do this find the total energy 
content of the estimated received signals and then sort them. 
A threshold value, 77, is then defined as a percentage of the 

^See Appendix B 
^See Appendix C 







energy content that is to be retained. Starting with the most 
significant estimated signal, the estimated signals are summed 
until the threshold is reached and the remaining signals set 
to be equal to 0. The remaining non-zero valued signals then 
give the DOA estimates and L as an estimate of the number 
of far field signals impinging on the array. 

C. Bayesian Compressed Sensing Kalman Filter 

In order to track the changes in the DOA estimates at 
each time snapshot the BCS based DOA estimation procedure 
detailed above is combined with a BKF, giving a BCSKF for 
DOA estimation. Here, the signal model described above is 
again used along with the prediction 

Xfc|fc-i = Xfc_i|fe_i + Ax + Pj. ^ 

yfcifc-i = Axfc|fc_i Ye.fc = yfc-yfe|fe-i (i9) 

and update steps 

— X/e|/c—1 e^k '^k\k “ 

K/c = (ct^I- f AS^.n._iA ) ^ (20) 

of the BKF. Here, k\k — l indicates prediction at time instance 
k given the previous measurements and Ax is determined by 
the expected DOA change. For example, if we sample the 
angular range every 1° and the the DOA increases by 2° then 
then Ax will be selected to increase the index of the non¬ 
zero valued entries in Xfe_i|fc_i by two to give the index of 
the non-zero valued entries in X;;H._i. In this work we have 
assumed that there will be a constant change in the DOA. 

At each time snapshot it is necessary to estimate the 
noise variance and hyperparameters in order to evaluate the 
prediction and update steps of the BCSKF. This is done by 
considering the log likelihood function given by 

= -^(2Mlog(27r)-f 2Mlogcr^ - (21) 

log|S| -log|P| -f cr-^||yg_fc -A/x||^ 

-l-pt^P/X-t-Xfe|fe_iPXfc|fc-l 

which can be optimised by following the procedure described 
in Section lTl-BI Here we have used the Kalman filter prediction 
^k\k-i the expected estimate values Xg. 

It is worth noting that the continued accuracy of the pro¬ 
posed BCSKF relies on the accuracy of the initial estimate 
and the parameter values selected. If the initial estimate 
(made using the framework described in Section III-BI and 
Xe = [0, 0,..., 0]^) of the received signals is accurate and 
sparse, then the priors that are enforced will ensure this 
continues to be the case. However, an inaccurate initial DOA 
estimate or poorly matched expected DOA change can lead 
to the introduction of inaccuracies in subsequent estimates. 
Similarly, if the initial estimate of the received signals is not 


TABLE I 

Performance summary for the endfire region example. 


Method 

Mean RMSE 
(degrees) 

Mean Computation 
Time (seconds) 

RVM 

11.03 

0.35 

Modified RVM 

0.98 

0.55 


sparse then subsequent estimates are likely to not be sparse. 
As a result, care should be taken when choosing the initial 
parameter values and determining the likely DOA change. 


HI. Performance Evaluation 


In this section a comparison in performance of the tradi¬ 
tional RVM and the proposed modified RVM will be made. 
Three example scenarios will be considered. One where the 
initial DOA starts outside of the endfire region and then moves 
into it, one where the DOA remains out of the endfire region 
and finally one where the initial DOAs and signal values are 
randomly generated. All of the examples are are implemented 
in Matlab on a computer with an Intel Xeon CPU E3-1271 
(3.60GHz) and 16GB of RAM. 

The performance of each method will be measured using 
the root mean square error (RMSE) in DOA estimate. This 
is given by 


RMSE = 




E 


9=1 


Q 


( 22 ) 


where 9 is the actual DOA, 6 is the estimated DOA and Q 
is the number of Monte Carlo simulations carried out, with 
Q = 100 being used in each case. 

The array structure being considered is a 20 sensor ULA 
with an adjacent sensor separation of We assume the actual 
noise variance is given by cr^ = 0.4 and an initial estimate of 
the noise variance of = 0.1 used when initialising the 
RVM and proposed modified RVM. 


A. Endfire Region 

Eor this example the initial DOA of the signal is 0 = 20°, 
which then decreases by 1° at each time snapshot. The signal 
value at each snapshot is set to be 1. Table U summarises the 
performance of the two methods for this example. Here we can 
see that there is in an improved accuracy in the DOA estimates, 
as the mean RMSE has decreased for the proposed modified 
RVM method. This is also supported by the overall RMSE 
values as illustrated in Eig. |2] It is also worth noting that the 
mean computation times show that this improvement has not 
been at the expense of a significant increase in computational 
complexity. 
















Time Snapshot 


Fig. 2. RMSE values for the endfire region example. 

TABLE II 

Performance summary for the non-endfire region example. 


Method 

Mean RMSE 
(degrees) 

Mean Computation 
Time (seconds) 

RVM 

5.59 

0.33 

Modified RVM 

0.36 

0.41 


B. Non-Endfire Region 

In this instance the initial DOA is 0 = 100° with the DOA 
increasing by 1° at each time snapshot, with the signal value 
remaining constant at -1. The performance of the two methods 
is summarised in Table HU with the RMSE values illustrated 
in Fig. [3 Again this illustrates the improved performance 
offered by the modified RVM has not been at the expense 
of a significant increase in computation time. 

C. Random Initial DOA 

Finally, we consider the case where the initial DOA is 
randomly chosen from the entire angular range and increased 
by 1° at each time snapshot. The signal value is randomly 
selected as ±1 for each simulation and remains constant as 
the DOA changes. As for the previous two examples Table. 
m and Fig. |4] indicate that the proposed modified RVM has 



Time Snapshot 

Fig. 3. RMSE values for the non-endfire region example. 


TABLE III 

Performance summary for the random initial DOA example. 


Method 

Mean RMSE 
(degrees) 

Mean Computation 
Time (seconds) 

RVM 

10.98 

0.34 

Modified RVM 

3.52 

0.43 



Time Snapshot 


Fig. 4. RMSE values for the random initial DOA example. 


obtained an improved accuracy without a significant increase 
in computational complexity. 

IV. Conclusions 

This paper proposes a BCSKF to estimate the DOA of 
a single signal impinging on a ULA from the far field. A 
new posterior distribution and marginal likelihood has been 
found and unlike traditional BCS the expected values of 
the estimates are accounted for. This is done to combat the 
problem of inaccurate DOA estimates when the actual DOA 
approaches the endfire region of the angular range. Then a 
similar optimisation framework to what is used in the RVM 
is applied to find the optimal hyperparameters and noise 
variance estimate, which are then used to estimate the received 
array signals. Example test scenarios have shown the proposed 
modified RVM is more accurate in not only the endfire region, 
but also in the entire angular region as a whole. This is also 
without a significant increase in computational complexity. 

Appendix 

A. Derivation of Posterior Distribution 

From Bayes’ rule we know that 

^(Xfc|yfe,P>Cr^,Xe)T’(yfc|p,CT2,Xe) = (23) 

^(yfe|Xfe>Cr2)^(Xfe|P,Xe), 

where 7^(yj,|xfc, cr^) and 7^(x/c|p,Xe) are known from (|5]l and 
©, respectively. 

Now following the method suggested in ina carry out the 
multiplication on the right hand side, collect terms in x^ in 










































the exponential and complete the square. 
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2 L 


o- (yfc-Axfe) (yfe-Axfc)- 
(Xfc - Xe)^P(Xfc - Xe) 


^yfeAxfe-cr ^XfcA'y^ 


- 2 ~T 


- 2 ~T : 


2 L 


(T ‘^xTA Axfc + Xt PXfc — Xt Px, — xi Pxi. 


1 ■ 
2 . 


+xJPXe 


(xfe-/r)-'S ^ (xfc -/x) -S V 


^yfeyfe + xJPxe 


CT ^yfcyfe+x^Pxe-/x^ S V 


Tv^-l. 


B. Derivation of Marginal Likelihood 
From (l27b we know that 

'Dd, ^ t - ^(yfc|Xfc,g^),:P(Xfc|p,Xe) 

^(yfc ;^ej l~ 2 t ’ 

meaning the term in the exponential will be (l25T l where 

/x'^S’V = (CT"^A^yfc +PXe)'^S^S~^ 

X S(cr"^A^y^.+Pxe) 


= (ct ^A’^yfe+Pxe)^(cr ^SA^y^ + SPxe) 


= cr -‘y^ASA* y^^ + cr ^y^ASPxe + 
cr'^x^P'^A'^yj;, +x^P^SPxe. 

Therefore the exponential term is given by 


-2^,T ; 


This gives the marginal likelihood as 


(24) 


^(yfclp-cr >Xe) = (27rcr ) 


2\-M 


S|2|P|5 


(30) 


X exp 


{ - ^{fk^yk 


■ xi Cx. - 


2 cr^y^ASPx. 


J}. 


where B and C are defined as in Section III-BI The log 
likelihood is then given by 


. 2 ^-M 


>C(P,o-^) = log|(27rcr^) 

'{ - ^[y^By 


X exp • 


S|5|P|2 

xTCxp. - 


(31) 


2 a 2 y^ASPx 


where S and /x are given by (fTTT) and (fT^ . respectively. This 
then gives the posterior distribution as (fTol) . with the remaining 
exponential terms 


J}} 


= —Mlog(27r) — Mlogcr^+ -log |SI + 


ilog |P| - ^[ylByfc 


■ xi CXe - 


(25) 


2 cr^yfcASPxe]. 

Using the Woodbury matrix inversion identity we have 
B = cr“^I — cr“^A(P + (T“^A A)“^A cr“^ 


(26) 


(27) 


which means we have 


yfeByfe = yfcC^ a 


(32) 


(33) 


X (P + a-^A A)-^A a-^)y, 

= yfc c^“^yfe - yl AsA’^cr-^y^ 

= cr'^yl (yfe - A/x) + a-^flA-EPxe 
= CT“^||yfe -A/x||^+ /x^P/x + cr"2yfcASPxe. 


Also, we know that P^ = P as P is a real valued diagonal 
matrix. This means 


xjCXe = xj [P - PSPjXe 

= xTPXp — xJPSPXp 


(34) 


= xJPXe -X^P/X + Cr ^y^ASPXe, 
which then gives the log likelihood function in (fT4l i. 

C. Derivation of Update Expressions for Modified RVM 
Firstly, differentiating with respect to pi gives 


CT ^yfcyfc+x^Pxe - cr ^yfcASA^yfe- (28) 

a-^y^ASPxe-a-^xfP^A^yfc- 

xJP^SPxp 


V _ ^ I ,,‘2 , ^2 


—‘nn “t“ „ Xe nPn 

2 1 Pn 


(35) 


fi - ^-"ASA ]y, +xi[P- P^ SP]: 


-a- 2 y^ASPXe-(T-^x^ P^ A' yfe 
The term outside of the exponential is given by 

( 2 ^(t 2 )-^( 2 ^)-^|P| 1/2 


and equating to zero gives 
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(27r 


l-ATI 


= (27ra") 


2 ^-M 


S|5|P|2. 


(29) 


y^nn T Bn 4” n Xe npri — 0 (36) 

Pn 

1 Pn^^nn PnPrt P^^e^n 4” PnXe,npn — 0 

Tn Pn\Pn 4“ ^e,n ^e.nBn] — 0 


which leads to (fTSt . 




















Now collect the terms with a in to give 

-i 2Mlogcr2-log|S|+cr-2||yfe-A/x||^ 
and then define t = giving 

2MlogT-i-log|S|+T||yfc-A/r||^ 
= -^[-2Mlogr-log|S|+T||yfc-A/r||^ 


(37) 


(38) 


Now differentiate (l3^ with respect to t and equate to zero to 
give 

2M ~T ~ 

-+ tr(SA A) + I ly^ - A/r11^ = 0, (39) 

T 

where tr(-) indicates the trace. As tr(EA A) can be written 
as ^ 7 „ we now have 

n 

t-1 (2M-^7„) = Ilyfe-A/r||^, (40) 

n 

which in turn gives (fThl l. 
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